Improved modeling of clinical data with kernel methods

Authors

  • Anneleen Daemen
  • Dirk Timmerman
  • Thierry Van den Bosch
  • Cecilia Bottomley
  • Emma Kirk
  • Caroline Van Holsbeke
  • Lil Valentin
  • Tom Bourne
  • Bart De Moor
Abstract

OBJECTIVE Despite the rise of high-throughput technologies, clinical data such as age, gender and medical history guide clinical management for most diseases and examinations. To improve clinical management, the available patient information should be fully exploited, which requires appropriate modeling of the relevant parameters.
METHODS When kernel methods are used, traditional kernel functions such as the linear kernel are often applied to the set of clinical parameters. These kernel functions are poorly suited to clinical data, however, because such data mix variable types and each variable has its own range. We propose a new kernel function specifically adapted to these characteristics.
RESULTS The clinical kernel function provides a better representation of patients' similarity by equalizing the influence of all variables and taking into account the range r of each variable. Moreover, it is robust with respect to changes in r. Incorporated in a least squares support vector machine, the new kernel function results in significantly improved diagnosis, prognosis and prediction of therapy response. This is illustrated on four clinical data sets within gynecology, with an average increase in test area under the ROC curve (AUC) of 0.023, 0.021, 0.122 and 0.019, respectively. Moreover, when clinical parameters and expression data were combined in three case studies on breast cancer, results improved overall both with use of the new kernel function and when the two data types were combined in a weighted fashion, with a larger weight assigned to the clinical parameters. The increase in AUC with respect to a standard kernel function and/or unweighted data combination was at most 0.127, 0.042 and 0.118 for the three case studies, respectively.
CONCLUSION For clinical data consisting of variables of different types, the proposed kernel function, which takes into account the type and range of each variable, has been shown to be a better alternative for linear and non-linear classification problems.
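
The abstract does not give the kernel's formula, so the following is only an illustrative sketch of one way a kernel could equalize the influence of all variables while using each variable's range r. It assumes, without confirmation from the text above, that continuous and ordinal variables are compared as (r - |a - b|) / r, that nominal variables score 1 when equal and 0 otherwise, and that the per-variable similarities are averaged. The function name `clinical_kernel` and its parameters `ranges` and `nominal` are hypothetical.

```python
import numpy as np

def clinical_kernel(X, Z, ranges, nominal):
    """Sketch of a range-aware kernel for mixed-type clinical variables.

    X, Z    : arrays of shape (n_x, p) and (n_z, p), nominal values coded as numbers
    ranges  : length-p array, range r of each continuous/ordinal variable
              (assumed here to come from the training data)
    nominal : length-p boolean array, True where the variable is nominal
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    nominal = np.asarray(nominal, dtype=bool)
    # Nominal variables do not use a range; substitute 1 to avoid dividing by 0.
    r = np.where(nominal, 1.0, np.asarray(ranges, dtype=float))

    # Pairwise absolute differences per variable, shape (n_x, n_z, p).
    diff = np.abs(X[:, None, :] - Z[None, :, :])

    # Continuous/ordinal variables: similarity (r - |a - b|) / r.
    sim = (r - diff) / r
    # Nominal variables: 1 if the two values are equal, 0 otherwise.
    sim[:, :, nominal] = (diff[:, :, nominal] == 0).astype(float)

    # Averaging gives every variable the same influence on the kernel value.
    return sim.mean(axis=2)

# Hypothetical example: age (assumed range 60 years) and a binary nominal flag.
patients = np.array([[30.0, 0.0],
                     [60.0, 1.0]])
K = clinical_kernel(patients, patients, ranges=[60.0, 1.0], nominal=[False, True])
print(K)
```

A matrix computed this way could be passed to any classifier that accepts a precomputed kernel; the least squares support vector machine used in the study is not reproduced here.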

Similar resources

THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)

Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes.  Small area estimation is needed  in obtaining information on a small area, such as sub-district or village.  Generally, in some cases, small area estimation uses parametric modeling.  But in fact, a lot of models have no linear relationship between the small area average and the covariat...

Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search

In this paper, a method for predicting time series is presented. Time series prediction is a process which predicted future system values based on information obtained from past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...

An interior-point algorithm for $P_{*}(\kappa)$-linear complementarity problem based on a new trigonometric kernel function

In this paper, an interior-point algorithm for $P_{*}(\kappa)$-Linear Complementarity Problem (LCP) based on a new parametric trigonometric kernel function is proposed. By applying strictly feasible starting point condition and using some simple analysis tools, we prove that our algorithm has $O((1+2\kappa)\sqrt{n}\log n\log\frac{n}{\epsilon})$ iteration bound for large-update methods, which coinc...

Online learning of positive and negative prototypes with explanations based on kernel expansion

The issue of classification is still a topic of discussion in many current articles. Most of the models presented in the articles suffer from a lack of explanation for a reason comprehensible to humans. One way to create explainability is to separate the weights of the network into positive and negative parts based on the prototype. The positive part represents the weights of the correct class ...

Kernel Ridge Estimator for the Partially Linear Model under Right-Censored Data

Objective: This paper aims to introduce a modified kernel-type ridge estimator for partially linear models under randomly-right censored data. Such models include two main issues that need to be solved: multi-collinearity and censorship. To address these issues, we improved the kernel estimator based on synthetic data transformation and kNN imputation techniques. The key idea of this paper is t...

Journal:
  • Artificial intelligence in medicine

Volume 54, Issue 2

Pages: -

Publication date: 2012